Skip to content

Add TelemetryConfigRef support to MCPRemoteProxy#4719

Merged
ChrisJBurns merged 8 commits intomainfrom
chrisburns/mcpremoteproxy-telemetry-config-ref
Apr 13, 2026
Merged

Add TelemetryConfigRef support to MCPRemoteProxy#4719
ChrisJBurns merged 8 commits intomainfrom
chrisburns/mcpremoteproxy-telemetry-config-ref

Conversation

@ChrisJBurns
Copy link
Copy Markdown
Collaborator

Summary

MCPRemoteProxy lacked the ability to reference shared MCPTelemetryConfig resources, forcing users to duplicate inline telemetry configuration across every proxy instance. This brings MCPRemoteProxy to parity with MCPServer's telemetry API by adding TelemetryConfigRef support.

Closes #4620

Type of change

  • New feature

Changes

File Change
cmd/thv-operator/api/v1alpha1/mcpremoteproxy_types.go Add TelemetryConfigRef field with CEL mutual exclusivity validation, deprecate inline Telemetry, add TelemetryConfigHash status field and condition constants
cmd/thv-operator/pkg/controllerutil/config.go Add GetTelemetryConfigForMCPRemoteProxy() helper — namespace-scoped fetch returning (nil, nil) for not-found
cmd/thv-operator/controllers/mcpremoteproxy_controller.go Add handleTelemetryConfig() handler, call in validateAndHandleConfigs(), add mapTelemetryConfigToMCPRemoteProxy() watch mapper, register MCPTelemetryConfig watch in SetupWithManager()
cmd/thv-operator/controllers/mcpremoteproxy_runconfig.go Prefer TelemetryConfigRef over inline Telemetry in RunConfig; extract addTelemetryOptions and resolveToolConfig helpers to fix cyclomatic complexity

Test plan

  • task lint-fix passes (0 issues)
  • task test passes (all operator unit tests)
  • go build ./... passes
  • New unit tests: TestGetTelemetryConfigForMCPRemoteProxy (4 cases), TestHandleTelemetryConfig_MCPRemoteProxy (5 cases), TestMapTelemetryConfigToMCPRemoteProxy
  • Existing TestCreateRunConfigFromMCPRemoteProxy tests still pass with updated ctx parameter

Special notes for reviewers

  • The resolveToolConfig extraction in mcpremoteproxy_runconfig.go is a refactor needed to keep cyclomatic complexity under the lint threshold (was already at 15 pre-change). No behavioral change.
  • CEL validation is tested via the generated CRD manifest; integration test coverage for MCPRemoteProxy CEL rules can follow in a separate PR.
  • VirtualMCPServer telemetry config ref is out of scope (separate issue).

Generated with Claude Code

ChrisJBurns and others added 6 commits April 9, 2026 23:56
Brings MCPRemoteProxy to parity with MCPServer's telemetry API by adding
a TelemetryConfigRef field for referencing shared MCPTelemetryConfig
resources. Includes CEL mutual exclusivity validation with the deprecated
inline Telemetry field, TelemetryConfigHash in status for change
detection, and condition type/reason constants.

Part of #4620

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Namespace-scoped fetch for MCPTelemetryConfig referenced by an
MCPRemoteProxy. Returns (nil, nil) when the ref is nil or the resource
is not found, matching the MCPServer getTelemetryConfigForMCPServer
contract.

Part of #4620

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…roxy

Wire up the reconciler to validate referenced MCPTelemetryConfig
resources, track config hashes in status, and reconcile when the
underlying MCPTelemetryConfig changes. Follows the same handler pattern
as handleToolConfig and the MCPServer telemetry handler.

Part of #4620

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…Config

When building the RunConfig, resolve telemetry from the referenced
MCPTelemetryConfig first and fall back to the deprecated inline
Telemetry field. Adds ctx parameter to createRunConfigFromMCPRemoteProxy
to support the API fetch, matching the MCPServer runconfig pattern.

Part of #4620

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Auto-generated from MCPRemoteProxy type changes: adds
telemetryConfigRef to CRD schema with CEL mutual exclusivity validation,
telemetryConfigHash to status, and updated API reference docs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract resolveToolConfig and addTelemetryOptions from
createRunConfigFromMCPRemoteProxy to bring cyclomatic complexity below
the threshold. Rename shadowed ctx to apiCtx for clarity.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions bot added the size/L Large PR: 600-999 lines changed label Apr 9, 2026
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 9, 2026

Codecov Report

❌ Patch coverage is 52.26131% with 95 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.94%. Comparing base (d851c69) to head (0cce921).
⚠️ Report is 35 commits behind head on main.

Files with missing lines Patch % Lines
...rator/controllers/mcptelemetryconfig_controller.go 13.33% 35 Missing and 4 partials ⚠️
...-operator/controllers/mcpremoteproxy_controller.go 68.88% 22 Missing and 6 partials ⚠️
...v-operator/controllers/mcpremoteproxy_runconfig.go 53.84% 12 Missing and 6 partials ⚠️
...-operator/controllers/mcpremoteproxy_deployment.go 33.33% 8 Missing ⚠️
cmd/thv-operator/pkg/controllerutil/config.go 84.61% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4719      +/-   ##
==========================================
+ Coverage   68.71%   68.94%   +0.23%     
==========================================
  Files         517      518       +1     
  Lines       54817    54921     +104     
==========================================
+ Hits        37666    37868     +202     
+ Misses      14252    14142     -110     
- Partials     2899     2911      +12     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Contributor

@jhrozek jhrozek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review from automated analysis. Five findings — three require changes in files not touched by this PR (prose below), two have inline comments on the diff.


[HIGH] Missing +kubebuilder:rbac annotation for mcptelemetryconfigs

File: cmd/thv-operator/controllers/mcpremoteproxy_controller.go (around line 48–51)

The MCPRemoteProxyReconciler now lists and gets MCPTelemetryConfig objects (in mapTelemetryConfigToMCPRemoteProxy and via GetTelemetryConfigForMCPRemoteProxy), but there is no +kubebuilder:rbac marker for mcptelemetryconfigs. task operator-manifests generates the ClusterRole from these markers — without this one, the controller gets permission-denied errors at runtime the first time any proxy uses TelemetryConfigRef.

MCPServerReconciler already has:

// +kubebuilder:rbac:groups=toolhive.stacklok.dev,resources=mcptelemetryconfigs,verbs=get;list;watch

Add the same line to MCPRemoteProxyReconciler alongside the existing mcptoolconfigs/mcpoidcconfigs markers.


[HIGH] MCPTelemetryConfigReconciler.findReferencingWorkloads does not scan MCPRemoteProxy

File: cmd/thv-operator/controllers/mcptelemetryconfig_controller.go line 259 (not changed in this PR)

findReferencingWorkloads only calls FindWorkloadRefsFromMCPServers. It never looks at MCPRemoteProxy, so:

  1. Deletion protection is broken: an operator can delete an MCPTelemetryConfig referenced only by a proxy — the finalizer check passes.
  2. status.referencingWorkloads will never include proxy names.
  3. The reverse-direction watch (proxy changes → re-evaluate MCPTelemetryConfig) is also missing from SetupWithManager.

MCPOIDCConfigReconciler and MCPExternalAuthConfigReconciler were both extended to include MCPRemoteProxy when those ref fields were added. The same pattern needs to be applied here.


[HIGH] Missing CA bundle volume mount for TelemetryConfigRef in proxy deployment

File: cmd/thv-operator/controllers/mcpremoteproxy_deployment.go (buildVolumesForProxy, not changed in this PR)

addTelemetryOptions correctly calls ctrlutil.TelemetryCABundleFilePath(telCfg) and embeds the path in the RunConfig. At pod startup the proxy runner opens that path via os.ReadFile to build a custom TLS config for the OTLP collector.

However, buildVolumesForProxy never calls ctrlutil.AddTelemetryCABundleVolumes. The ConfigMap is never mounted into the pod. Any user who sets spec.openTelemetry.caBundleRef in their MCPTelemetryConfig will get a file-not-found error at proxy startup.

MCPServerReconciler.deploymentForMCPServer already handles this correctly — the same block needs to be added to deploymentForMCPRemoteProxy.

JAORMX
JAORMX previously approved these changes Apr 10, 2026
Five fixes from code review:

1. Add +kubebuilder:rbac marker for mcptelemetryconfigs on
   MCPRemoteProxyReconciler — required for runtime API access

2. Extend MCPTelemetryConfigReconciler.findReferencingWorkloads to scan
   MCPRemoteProxy in addition to MCPServer, fixing deletion protection
   and referencingWorkloads status. Add MCPRemoteProxy watch to
   SetupWithManager for the reverse-direction reconciliation

3. Mount telemetry CA bundle volumes in proxy deployment via
   addTelemetryCABundleVolumes — without this, caBundleRef users get
   file-not-found at proxy startup

4. Fix condition removal not persisted when TelemetryConfigHash is
   already empty — check for stale condition before deciding whether
   to call Status().Update()

5. Fix test assertions to re-fetch persisted state from fakeClient
   instead of reading in-memory pointers. Add recovery-from-False and
   stale-condition-removal test cases

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions bot added size/L Large PR: 600-999 lines changed and removed size/L Large PR: 600-999 lines changed labels Apr 13, 2026
@ChrisJBurns
Copy link
Copy Markdown
Collaborator Author

Thanks for the thorough review @jhrozek — all five findings are addressed in 09a948e.

[HIGH] Missing +kubebuilder:rbac annotation

Added // +kubebuilder:rbac:groups=toolhive.stacklok.dev,resources=mcptelemetryconfigs,verbs=get;list;watch to MCPRemoteProxyReconciler alongside the existing mcptoolconfigs/mcpoidcconfigs markers. The generated ClusterRole already had this permission (from the MCPTelemetryConfigReconciler), but the marker is needed for correctness and documentation.

[HIGH] findReferencingWorkloads does not scan MCPRemoteProxy

Extended findReferencingWorkloads to call FindReferencingMCPRemoteProxies and merge the results with MCPServer refs. Also added mapMCPRemoteProxyToTelemetryConfig and registered it as a watch in SetupWithManager, following the same pattern as MCPExternalAuthConfigReconciler and MCPOIDCConfigReconciler. This fixes deletion protection and status.referencingWorkloads for proxy-only references.

[HIGH] Missing CA bundle volume mount

Added addTelemetryCABundleVolumes helper to mcpremoteproxy_deployment.go and called it from deploymentForMCPRemoteProxy right after buildVolumesForProxy. Follows the exact pattern from MCPServerReconciler.deploymentForMCPServer (line 1220).

Two new integration tests in the mcp-telemetry-config suite:

1. MCPRemoteProxy tracked in ReferencingWorkloads — creates a proxy
   referencing an MCPTelemetryConfig and asserts the proxy appears in
   status.referencingWorkloads via the MCPRemoteProxy watch handler.

2. Deletion protection for proxy-only references — verifies that the
   finalizer blocks deletion while an MCPRemoteProxy references the
   config, and allows deletion after the proxy is removed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/L Large PR: 600-999 lines changed labels Apr 13, 2026
Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Large PR Detected

This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.

How to unblock this PR:

Add a section to your PR description with the following format:

## Large PR Justification

[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation

Alternative:

Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.

See our Contributing Guidelines for more details.


This review will be automatically dismissed once you add the justification section.

Copy link
Copy Markdown
Contributor

@jhrozek jhrozek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All five review findings have been addressed in the latest push:

  1. RBAC annotation+kubebuilder:rbac for mcptelemetryconfigs added to MCPRemoteProxyReconciler
  2. findReferencingWorkloads — now scans MCPRemoteProxy via FindReferencingMCPRemoteProxies, plus Watches and mapper added to MCPTelemetryConfigReconciler.SetupWithManager
  3. CA bundle volume mount — new addTelemetryCABundleVolumes method wires up the ConfigMap volume in the proxy deployment
  4. Test assertions — tests now re-fetch persisted state via fakeClient.Get instead of asserting on in-memory pointers
  5. nil-ref condition persistencecondRemoved || hash != "" guard ensures the status update fires even when the hash was already empty

New recovery-from-False and stale-condition-removal test cases also added. Integration tests extended for the telemetry config controller. Looks good.

@ChrisJBurns ChrisJBurns merged commit 691f73a into main Apr 13, 2026
41 checks passed
@ChrisJBurns ChrisJBurns deleted the chrisburns/mcpremoteproxy-telemetry-config-ref branch April 13, 2026 12:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/XL Extra large PR: 1000+ lines changed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add TelemetryConfigRef support to MCPRemoteProxy

3 participants